Fix subnet unsubscription time#6890
Conversation
michaelsproul
left a comment
There was a problem hiding this comment.
Nice find!
The fix makes sense to me.
The testnet case is interesting, but I think a fix for that can wait for another PR. I guess we might want to special-case the VC's/BN's handling of the first ~2 slots of a network (feels kind of dumb), or remove the loud warnings for late subscriptions from the BN and just track them with a metric.
|
Looks like one of the tests is failing. It might actually be expecting the old behaviour 👀 |
|
I fixed the failing test according to my understanding. After the refactor, we decide to extend the unsubscription only in TL;DR: the test |
| let current_slot = self.beacon_chain.slot_clock.now().unwrap_or_default(); | ||
| if let Err(e) = self.subscribe_to_subnet_immediately(subnet, current_slot + 1) { | ||
| let ExactSubnet { subnet, slot } = exact_subnet; | ||
| if let Err(e) = self.subscribe_to_subnet_immediately(subnet, slot + 1) { |
There was a problem hiding this comment.
You should add some comments referencing the issue and the reason for slot + 1 in the code. Useful if we ever refactor this and git blame breaks
1f7c612 to
53fb5ea
Compare
|
This pull request has been removed from the queue for the following reason: The merge conditions cannot be satisfied due to failing checks: You should look at the reason for the failure and decide if the pull request needs to be fixed or if you want to requeue it. If you want to requeue this pull request, you need to post a comment with the text: |
|
Fix for CI failure: |
AgeManning
left a comment
There was a problem hiding this comment.
Yeah nice catch.
I wouldn't have imagined this would change the density of subscribed subnets between 5.3 and 6.0?
|
@mergify requeue |
✅ The queue state of this pull request has been cleaned. It can be re-embarked automatically |
Looks like this was introduced in #6146 which was part of the 6.0 release (My bad, I missed this in the review!). The old logic was good and same as the fix in this PR: |
Issue Addressed
Hopefully fixes #6732
Proposed Changes
In our
scheduled_subscriptions, we were setting unsubscription slot to becurrent_slot + 1. Given that we were subscribing to the subnet atduty.slot - 1, the unsubscription slot ended up beingduty.slot. So we were unsubscribing to the subnet at the beginning of the duty slot which is insane.Fixes the
scheduled_subscriptionsto unsubscribe atduty.slot + 1.Additional Info
Running this on kurtosis made me realise that we do not send subscriptions for the first couple of slots because of https://github.com/sigp/lighthouse/blob/unstable/validator_client/validator_services/src/duties_service.rs#L79-L85
Not sure if we want to handle that.